compressive imaging
Neural Proximal Gradient Descent for Compressive Imaging
Recovering high-resolution images from limited sensory data typically leads to a seriously ill-posed inverse problem, demanding inversion algorithms that effectively capture prior information. Learning a good inverse mapping from training data faces severe challenges, including: (i) scarcity of training data; (ii) the need for plausible reconstructions that are physically feasible; (iii) the need for fast reconstruction, especially in real-time applications. We develop a system that addresses all of these challenges, using as its basic architecture the repeated application of alternating proximal and data-fidelity constraints. We learn a proximal map that works well with real images based on residual networks with recurrent blocks. Extensive experiments are carried out under different settings: (a) reconstructing abdominal MRI of pediatric patients from highly undersampled k-space data and (b) super-resolving natural face images. Our key findings include: (1) a recurrent ResNet with a single residual block (10-fold repetition) yields an effective proximal map that accurately reveals MR image details; (2) our architecture significantly outperforms conventional non-recurrent deep ResNets by 2 dB SNR, and it also trains much more rapidly.
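The alternation of a data-fidelity step and a proximal step described in this abstract can be sketched as classical proximal gradient descent. This is only an illustration under stated assumptions: the paper learns its proximal map with a recurrent ResNet, whereas the sketch below swaps in a hand-crafted soft-thresholding proximal (sparsity prior), a random linear measurement matrix `A`, and hypothetical helper names (`soft_threshold`, `proximal_gradient`, `objective`) that do not come from the paper.

```python
import numpy as np

def soft_threshold(x, tau):
    # Proximal map of tau * ||x||_1. ASSUMPTION: a hand-crafted stand-in
    # for the learned neural proximal described in the abstract.
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def objective(A, y, x, lam):
    # 0.5 * ||A x - y||^2 + lam * ||x||_1
    return 0.5 * np.sum((A @ x - y) ** 2) + lam * np.sum(np.abs(x))

def proximal_gradient(A, y, prox, step, n_iter=200):
    # Alternate a gradient step on the data-fidelity term with a proximal step.
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)        # gradient of 0.5 * ||A x - y||^2
        x = prox(x - step * grad)       # pull the iterate toward the prior
    return x

# Toy underdetermined problem (hypothetical sizes): 40 measurements, 100 unknowns.
rng = np.random.default_rng(0)
A = rng.standard_normal((40, 100)) / np.sqrt(40)
x_true = np.zeros(100)
x_true[[5, 37, 80]] = [1.0, -2.0, 1.5]
y = A @ x_true

lam = 0.01
step = 1.0 / np.linalg.norm(A, 2) ** 2  # 1 / Lipschitz constant of the gradient
x_hat = proximal_gradient(A, y, lambda v: soft_threshold(v, lam * step), step, n_iter=500)
```

With the step size set to the reciprocal of the squared spectral norm of `A`, each iteration does not increase the objective. In a learned variant, the `prox` argument would be replaced by a trained network while the data-fidelity step stays the same.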
Learning Spectral Diffusion Prior for Hyperspectral Image Reconstruction
Yu, Mingyang, Wu, Zhijian, Huang, Dingjiang
Hyperspectral image (HSI) reconstruction aims to recover a 3D HSI from its degraded 2D measurements. Deep learning-based methods have recently made great progress, but they often struggle to accurately capture the high-frequency details of the HSI. To address this issue, this paper proposes a Spectral Diffusion Prior (SDP) that is implicitly learned from hyperspectral images using a diffusion model. Leveraging the diffusion model's strong ability to reconstruct details, this learned prior can significantly improve performance when injected into an HSI model. To further improve the effectiveness of the learned prior, we also propose the Spectral Prior Injector Module (SPIM) to dynamically guide the model in recovering HSI details. We evaluate our method on two representative HSI methods: MST and BISRNet. Experimental results show that our method outperforms existing networks by about 0.5 dB, effectively improving the performance of HSI reconstruction.
Snapshot Compressed Imaging Based Single-Measurement Computer Vision for Videos
Pan, Fengpu, Wen, Jiangtao, Han, Yuxing
Snapshot compressive imaging (SCI) is a promising technique for capturing high-speed video at low bandwidth and low power, typically by compressing multiple frames into a single measurement. However, like traditional CMOS image sensor based imaging systems, SCI also faces challenges in low-light, photon-limited, and low-signal-to-noise-ratio imaging conditions. In this paper, we propose a novel Compressive Denoising Autoencoder (CompDAE) that uses the STFormer architecture as its backbone to explicitly model noise characteristics and provide computer vision functionalities, such as edge detection and depth estimation, directly from compressed sensing measurements while accounting for realistic low-photon conditions. We evaluate the effectiveness of CompDAE across various datasets and demonstrate significant improvements in task performance compared to conventional RGB-based methods. In ultra-low-light conditions (APC $\leq$ 20), where conventional methods fail, the proposed algorithm still maintains competitive performance.
OTLRM: Orthogonal Learning-based Low-Rank Metric for Multi-Dimensional Inverse Problems
Wang, Xiangming, Zeng, Haijin, Chen, Jiaoyang, Liu, Sheng, Chen, Yongyong, Chao, Guoqing
This property is vital for multi-dimensional inverse problems, such as tensor completion, spectral imaging reconstruction, and multispectral image denoising. Existing tensor singular value decomposition (t-SVD) definitions rely on hand-designed or pre-given transforms, which lack flexibility for defining tensor nuclear norm (TNN). The TNN-regularized optimization problem is solved by the singular value thresholding (SVT) operator, which leverages the t-SVD framework to obtain the low-rank tensor. However, it is quite complicated to introduce SVT into deep neural networks due to the numerical instability problem in solving the derivatives of the eigenvectors. In this paper, we introduce a novel data-driven generative low-rank t-SVD model based on the learnable orthogonal transform, which can be naturally solved under its representation. Prompted by the linear algebra theorem of the Householder transformation, our learnable orthogonal transform is achieved by constructing an endogenously orthogonal matrix adaptable to neural networks, optimizing it as arbitrary orthogonal matrices. Additionally, we propose a low-rank solver as a generalization of SVT, which utilizes an efficient representation of generative networks to obtain low-rank structures. Extensive experiments highlight its significant restoration enhancements.
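Two building blocks named in this abstract, the singular value thresholding (SVT) operator and an orthogonal matrix built from Householder transformations, can be sketched in a few lines. This is a minimal illustration in the plain matrix case, not the paper's tensor (t-SVD) model or its generative low-rank solver; the function names `svt` and `householder_orthogonal` and the way the reflection vectors are parameterized are assumptions made for the example.

```python
import numpy as np

def svt(X, tau):
    # Singular value thresholding: shrink every singular value by tau and
    # drop those that fall below zero (matrix case only, as an illustration).
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def householder_orthogonal(vectors):
    # Product of Householder reflections H = I - 2 v v^T / (v^T v).
    # Each H is orthogonal, so the product is orthogonal for any nonzero v.
    # ASSUMPTION: the rows of `vectors` play the role of free parameters.
    n = vectors.shape[1]
    Q = np.eye(n)
    for v in vectors:
        H = np.eye(n) - 2.0 * np.outer(v, v) / (v @ v)
        Q = Q @ H
    return Q

rng = np.random.default_rng(0)
Q = householder_orthogonal(rng.standard_normal((4, 4)))  # 4 reflections in R^4
X_low = svt(np.diag([3.0, 1.0, 0.5]), tau=1.0)           # keeps only the largest mode
```

Because orthogonality holds by construction for any nonzero reflection vectors, the vectors can be updated by unconstrained gradient steps without a separate re-orthogonalization step, which is what makes this kind of parameterization convenient inside a neural network.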
Reviews: Neural Proximal Gradient Descent for Compressive Imaging
While my concerns were given significant attention in the rebuttal, I feel they were not fully addressed. In particular, regarding comparison with deep ADMM-net and LDAMP, the authors argue that these methods need more training data/training time. However, training time is normally not a big issue (you only train your model once, does it matter if it takes 2 hours or 10?). The *size* of the training data is however important, but no experiments are provided to show superior performance of the proposed method with respect to the size of the training data. This is surprising given that in l. 62 the authors say they use "much less training data" (addressing the challenge of "scarcity of training data" mentioned in l. 4 of the abstract), without referring back to this claimed contribution anywhere in the paper!
Unfolding Framework with Prior of Convolution-Transformer Mixture and Uncertainty Estimation for Video Snapshot Compressive Imaging
We consider the problem of video snapshot compressive imaging (SCI), where sequential high-speed frames are modulated by different masks and captured in a single measurement. Reconstructing multiple frames from only one measurement amounts to solving an ill-posed problem. By combining optimization algorithms and neural networks, deep unfolding networks (DUNs) have achieved great success in solving inverse problems. In this paper, we work within the DUN framework and propose a 3D Convolution-Transformer Mixture (CTM) module with a 3D efficient and scalable attention model plugged in, which helps fully learn the correlation between the temporal and spatial dimensions by virtue of the Transformer. To the best of our knowledge, this is the first time that a Transformer has been applied to video SCI reconstruction. In addition, to further investigate the high-frequency information in the reconstruction process, which has been neglected in previous studies, we introduce variance estimation that characterizes the uncertainty on a pixel-by-pixel basis. Extensive experimental results demonstrate that our proposed method achieves state-of-the-art (SOTA) results, with a 1.2 dB gain in PSNR over the previous SOTA algorithm. We will release the code.
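The measurement process described here, in which each frame is modulated by its own mask and all frames are summed into one snapshot, can be sketched as follows. The binary random masks, the array sizes, and the function name `sci_measure` are assumptions chosen for illustration; noise and any sensor-specific effects are omitted.

```python
import numpy as np

def sci_measure(frames, masks):
    # Modulate each frame by its mask, then integrate over time:
    # y[i, j] = sum_t masks[t, i, j] * frames[t, i, j]
    return np.sum(frames * masks, axis=0)

rng = np.random.default_rng(0)
T, H, W = 8, 4, 4                                          # hypothetical sizes
frames = rng.random((T, H, W))                             # high-speed video block
masks = rng.integers(0, 2, size=(T, H, W)).astype(float)   # ASSUMPTION: binary masks
y = sci_measure(frames, masks)                             # single 2D snapshot
```

The snapshot `y` holds H x W values while the video block holds T x H x W unknowns, which is why reconstruction is ill-posed and requires a prior.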
Modeling Mask Uncertainty in Hyperspectral Image Reconstruction
Wang, Jiamian, Zhang, Yulun, Yuan, Xin, Meng, Ziyi, Tao, Zhiqiang
Recently, hyperspectral imaging (HSI) has attracted increasing research attention, especially for systems based on coded aperture snapshot spectral imaging (CASSI). Existing deep HSI reconstruction models are generally trained on paired data to retrieve the original signals from 2D compressed measurements produced by a particular optical hardware mask in CASSI. The mask strongly affects reconstruction performance and can act as a "model hyperparameter" governing data augmentation. This mask-specific training style leads to a hardware miscalibration issue, which creates barriers to deploying deep HSI models across different hardware and noisy environments. To address this challenge, we introduce mask uncertainty for HSI with a complete variational Bayesian learning treatment and explicitly model it through a mask decomposition inspired by real hardware. Specifically, we propose a novel Graph-based Self-Tuning (GST) network to reason about uncertainties, adapting to the varying spatial structures of masks across different hardware. Moreover, we develop a bilevel optimization framework to balance HSI reconstruction and uncertainty estimation, accounting for the hyperparameter property of masks. Extensive experimental results and model discussions validate the effectiveness (over 33/30 dB) of the proposed GST method under two miscalibration scenarios and demonstrate highly competitive performance compared with state-of-the-art well-calibrated methods. Our code and pre-trained model are available at https://github.com/Jiamian-Wang/mask_uncertainty_spectral_SCI